A new class of probabilistic models for cascading failure propagation in interconnected systems
is proposed. The models are able to represent important physical characteristics of realistic
load-redistribution mechanisms, e.g., that the load increments after a failure depend on the load
of the failing element and that they may be distributed non-uniformly among the remaining
elements. In the limit of large system sizes, the models are solved analytically in terms of
generalized branching processes, and the failure propagation properties of a prototype example
are analyzed in detail.



PACS numbers: 89.20.-a, 89.75.-k, 02.50.Ey 



I. INTRODUCTION 

The increasing complexity of today's infrastructure networks, e.g., electrical power grids, road
systems, or communication networks, makes them very sensitive to local failures [?]. When an
element in such a network fails, its "load" (e.g., power, traffic, or information flow) is
redistributed to the other elements of the network. Some of the increased loads may then exceed
the capacity of their respective element, leading to further failures and eventually to a
cascading breakdown of the entire network. Cascading failure propagation is not only observed
in physical infrastructure networks, but also in social and economic systems [1] or in the
fracture of heterogeneous materials [1, ?].

As a breakdown of critical infrastructure networks can have serious economic consequences, it is
crucial to gain a deeper understanding of the mechanisms that lead to such cascading failures.
This problem has, in particular, attracted the interest of the statistical physics community,
and various models have been developed to study the vulnerability of complex networks with
respect to cascading failure propagation [?]. The load redistribution has been described on
different levels of detail, e.g., by more physical approaches based on resistor networks [?] or
by complex-network models focusing on purely topological measures like the betweenness
centrality [?, 9, 10]. The dynamics of most of these models, however, can only be analyzed via
large-scale numerical simulations. In order to obtain an analytically solvable model, Dobson et
al. [3, ?] consider the simplifying assumption that the load increments after a failure are the
same for all remaining elements and independent of the failing load. Similarly, fiber bundle
models for the problem of fracture propagation [?, ?, 3] can only be solved analytically if the
load of the failing fiber is equally redistributed to all remaining fibers.

In this paper, we introduce and analyze a new class of probabilistic models for cascading
failure propagation that can represent, in a stochastic sense, important characteristics of
realistic load-redistribution mechanisms: The load redistribution after a failure is no longer
assumed to be uniform, and the induced load increments may depend on the load of the failing
element. With such models, we can thus expect to obtain a better understanding of the breakdown
processes in real networks. We show that in the limit of large system sizes, our models can be
solved analytically by using a Markov approximation and the theory of generalized branching
processes [13]. We then apply our general approach to an illustrative prototype system that
roughly imitates failure propagation in a power transmission network and analyze its
vulnerability with respect to cascading breakdown.



II. CASCADING-FAILURE MODEL 

We consider a system consisting of $N$ elements, each with a random load $L > 0$. The loads are
assumed to be independent of each other and identically distributed. Furthermore, every element
possesses a random critical load $L^{\max}$ above which it will fail. Whereas we assume that the
critical loads of the various elements are independent of each other, we allow for possible
correlations between the initial and critical loads of a particular element. Specifically, we
require that initially none of the elements is overloaded, i.e., the probability
$P(L > L^{\max})$ vanishes.

We now consider a situation where, due to some external influence, one of the elements, say
with load $L_f$, fails. Our central model assumption is that this load is redistributed to the
remaining elements according to the stochastic load-redistribution rule

$$L' = L + L_f A. \tag{1}$$

Here, $L$ ($L'$) is the load of one of the remaining elements before (after) failure of the
element with load $L_f$, and the load-redistribution factor $A$ is a random number drawn
independently from the same distribution for each of the remaining elements. In other words:
The load increments are proportional to the failed load $L_f$, but with random proportionality
factors $A$.
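
As a minimal illustration of rule (1), the following Python sketch applies one redistribution
step to a small list of loads. The function name `redistribute`, the callback `draw_factor`,
and the uniform factor distribution in the usage lines are assumptions made for this example
only; the text does not prescribe a particular distribution at this point.

```python
import random


def redistribute(loads, failed_load, draw_factor, rng):
    """Apply rule (1), L' = L + L_f * A, to every remaining element.

    draw_factor(rng) returns one independent sample of the factor A.
    """
    return [load + failed_load * draw_factor(rng) for load in loads]


rng = random.Random(0)
loads = [rng.random() for _ in range(5)]
# Hypothetical factor distribution, chosen only for illustration.
new_loads = redistribute(loads, 0.8, lambda r: r.uniform(0.0, 0.4), rng)
```

Each remaining element receives its own independent draw of the factor, so the increments
differ from element to element even though the failed load is the same.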

The form of rule (1) is based on the observation that in many systems, the load-redistribution
factors primarily depend on structural properties, such as interactions between the various
elements, and not on the load of the failing element. In a more "microscopic" approach, the
failure dynamics of such systems would be described by a model of the form (1), but with the
factor $A$ being determined by the specific interactions of the failing element with the one
affected by the failure. Corresponding examples range from the power-flow redistribution after
a line failure in power grids [16] to the distance-dependent stress redistribution in fiber
bundles [?]. The main features of a load redistribution of the form (1) can already be
understood by considering the extreme cases of a uniform, global load redistribution and a
purely local one. In the former case, each element is affected in the same way and thus
$A = 1/(N - 1)$. The latter situation is described by $A = 1/Z$ for the $Z$ nearest neighbors
of the failing element and zero otherwise. The stochastic load-redistribution rule (1) models
the microscopic $A$-dependence in terms of a noisy dynamics that neglects any spatial
correlations. While its specific form thus depends on the system at hand (we will consider an
example in Sect. IV below), we expect two properties to be generally fulfilled: (i) On average,
the failed load will be redistributed to the remaining $N - 1$ elements. This implies that the
mean $\langle A \rangle$ behaves as $1/N$ for large $N$. (ii) The $A$-distribution typically
will be bounded. For instance, if, in the worst case, one single element has to take over the
load of the failing element, one has $|A| \le 1$.

So far, we have only discussed the load redistribution after an initial failure. Obviously, it
can happen that the post-failure loads of a number $N_f^{(1)} \ge 1$ of elements are above
their respective critical loads. In such a situation, a failure cascade develops. For its
description, we assume that the overloaded elements fail simultaneously and that each failing
load is redistributed to the remaining elements according to rule (1) [?]. If this
redistribution results in further overloading, the cascade continues to a new cascade stage.
This process continues until the system either reaches a stable state, i.e., the remaining
elements operate within their bounds, or all $N$ elements have failed and the system has broken
down completely.
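
The cascade dynamics described above can be sketched as a simple stage-by-stage simulation.
This is an illustrative implementation under stated assumptions, not the authors' simulation
code: the names `simulate_cascade` and `draw_factor` are hypothetical, and every failing load
is redistributed independently to every surviving element via rule (1).

```python
import random


def simulate_cascade(loads, critical_loads, initial_index, draw_factor, rng):
    """Return the total number of failed elements for one cascade."""
    loads = list(loads)                  # work on a copy
    alive = set(range(len(loads)))
    failing = [initial_index]            # stage 0: the externally triggered failure
    n_failed = 0
    while failing:
        alive.difference_update(failing)
        n_failed += len(failing)
        failed_loads = [loads[i] for i in failing]
        # All elements of one stage fail simultaneously; each failed load
        # is redistributed to each surviving element with its own factor A.
        for i in alive:
            for failed_load in failed_loads:
                loads[i] += failed_load * draw_factor(rng)
        # Elements whose load now exceeds their critical load form the next stage.
        failing = [i for i in alive if loads[i] > critical_loads[i]]
    return n_failed


rng = random.Random(1)
initial = [rng.random() for _ in range(200)]
critical = [1.3 * load for load in initial]   # hypothetical critical loads
size = simulate_cascade(initial, critical, 0, lambda r: r.uniform(0.0, 0.01), rng)
```

The loop ends either when no surviving element is overloaded or when no element is left, which
corresponds to the two possible final states named in the text.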

Denoting the number of failures at each cascade stage $s = 1, 2, \ldots$ by $N_f^{(s)}$ and
counting the initial failure as $N_f^{(0)} = 1$, the total number of failed elements
$N_f = \sum_{s \ge 0} N_f^{(s)}$ provides a measure for the damage to the system. The
distribution of this random variable characterizes the system stability. Coarsely, two regimes
can be distinguished: (i) The probability of large $N_f$ decays quickly, i.e., at least
exponentially, and thus system-wide cascades, with $N_f$ of the order of $N$, constitute very
rare events; (ii) System-wide failures occur with finite probability even for $N \to \infty$.
At the separation between these two regimes, the system exhibits a "critical" behavior [?],
where large-scale events are still suppressed but their probability $P(N_f)$ only decays
according to a power law in $N_f$ for $N \to \infty$.

To determine the stability of a given system with respect to cascading failures, the detailed
form of the probability distribution $P(N_f)$ is not required and will not be evaluated in the
present paper. Instead, it suffices to have an indicator for the two regimes just outlined. An
obvious choice is the probability for a system-wide breakdown: $P_b = P(N_f = N)$. Another
quantity of interest is the probability that an initial failure does not induce any further
failures, in other words, the probability that no cascade develops at all:
$P_{nc} = P(N_f = 1)$.
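
Given a sample of total cascade sizes $N_f$ from repeated simulations, the two indicators can
be estimated as simple frequencies. The helper below is a hypothetical sketch; the function
name and the example data are assumptions made for illustration.

```python
def stability_indicators(cascade_sizes, n_elements):
    """Estimate (P_b, P_nc) from observed total cascade sizes N_f.

    P_b  = fraction of runs with N_f == N (system-wide breakdown),
    P_nc = fraction of runs with N_f == 1 (no cascade at all).
    """
    runs = len(cascade_sizes)
    p_b = sum(1 for n in cascade_sizes if n == n_elements) / runs
    p_nc = sum(1 for n in cascade_sizes if n == 1) / runs
    return p_b, p_nc


# Made-up cascade sizes for a system of N = 10 elements.
p_b, p_nc = stability_indicators([1, 1, 3, 10, 10, 10, 2, 1], 10)
```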



III. GENERALIZED-BRANCHING-PROCESS APPROXIMATION



In the limit of large systems, $N \to \infty$, when finite-size effects do not play a role, an
approximate description of the cascade dynamics can be obtained by making two observations:
First, during a failure cascade, the distribution of the not yet failed loads can be
approximated by their initial distribution. Thus, a Markovian description in terms of the loads
which fail at every cascade step becomes possible. The corresponding states form a point
process on the non-negative real axis [?]. Second, as the number of remaining elements always
stays infinitely large, the number of induced failures can be described by a Poisson
distribution. This yields an approximation of our model in terms of a generalized branching
process, which is fully defined by its characteristic functional

$$G[u; L_f] = \exp\left\{ \mu_f(L_f) \int dL_f'\, p(L_f' \mid L_f' > L^{\max}; L_f)
\left[ e^{-u(L_f')} - 1 \right] \right\}, \tag{2}$$



where $u$ denotes an arbitrary non-negative test function on the interval $[0, \infty)$ and
$p(L_f' \mid L_f' > L^{\max}; L_f)$ is the conditional probability density that a failure
induced by a failing load $L_f$ occurs with a load $L_f'$. Given the joint distribution of
initial and critical loads, as well as the distribution of the load-redistribution factors,
this quantity can be readily calculated. The mean number of induced failures is given by
$\mu_f(L_f) = (N - 1)\, P(L_f' > L^{\max} \mid L_f)$. Note that for a meaningful limit
$N \to \infty$ to exist, the conditional failure probability of a single element,
$P(L_f' > L^{\max} \mid L_f)$, has to be of order $O(1/N)$ (cf. the discussion above on the
mean of the load-redistribution factors).
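
The Poisson limit invoked above can be illustrated numerically. If each of the $N - 1$
remaining elements fails independently with probability $\mu_f/(N - 1)$, the probability that
none of them fails approaches $e^{-\mu_f}$, the zero-count probability of a Poisson
distribution with mean $\mu_f$. The sketch below is only a consistency check; the function
name and the value $\mu_f = 1.5$ are arbitrary choices for this example.

```python
import math


def binomial_no_failure(n_elements, mu):
    """Probability that none of the N - 1 remaining elements fails,
    if each fails independently with probability mu / (N - 1)."""
    p_single = mu / (n_elements - 1)
    return (1.0 - p_single) ** (n_elements - 1)


mu = 1.5
for n in (10, 100, 10_000):
    # The binomial value approaches the Poisson value exp(-mu) as N grows.
    print(n, binomial_no_failure(n, mu), math.exp(-mu))
```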

For the calculation of the breakdown and no-cascade probabilities, we condition these
quantities on the load $L_f$ of the failing element. From the Poissonian distribution of the
failures induced directly by this initial failure, one then obtains the conditional no-cascade
probability $P_{nc}(L_f) = \exp[-\mu_f(L_f)]$. The conditional breakdown probability
$P_b(L_f)$ can be obtained as the solution of the integral equation [?]

$$1 - P_b(L_f) = \exp\left\{ -\mu_f(L_f) \int dL_f'\, p(L_f' \mid L_f' > L^{\max}; L_f)\,
P_b(L_f') \right\}. \tag{3}$$



This relation can be interpreted in the sense that the probability that no complete breakdown
develops after a failure with load $L_f$ equals the probability that, in the limit
$N \to \infty$, none of the induced failures with load $L_f'$ leads to a breakdown. Starting
from an initial guess for $P_b(L_f)$, Eq. (3) can be efficiently solved by means of an
iterative procedure. This either yields the vanishing solution $P_b(L_f) = 0$ if the system is
immune against cascading failures or the unique nontrivial solution with finite breakdown
probability. Note that the range of possible $L_f$ in Eq. (3) might be larger than that of the
initial loads $L$ (see the example in Sect. IV below). We finally remark that an integral
relation similar to Eq. (3) can be derived for the generating function of the total number of
failures $N_f(L_f)$, where $L_f$ is the initially failing load.



IV. EXAMPLE: SIMPLE BIMODAL LOAD REDISTRIBUTION

As a simple, yet prototypical example, we now consider a bimodal load redistribution:

$$A = \begin{cases} A_0 & \text{with probability } p_0, \\
0 & \text{with probability } 1 - p_0. \end{cases} \tag{4}$$

Thus, a failure affects a specific other element with probability $p_0$, in which case this
element receives a portion $A_0$ of the failed load. In line with the above arguments, we
require $\langle A \rangle = p_0 A_0 = 1/(N - 1)$ and consequently $1/(N - 1) \le A_0 \le 1$.
On average, the failing load is redistributed to $(N - 1) p_0 = 1/A_0$ other elements. In this
sense, the model allows one to study the transition between the above-mentioned two extreme
cases of a global load redistribution for $A_0 = 1/(N - 1)$ and a load transfer to a single
other element, typically the nearest neighbor, for $A_0 = 1$.
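
A draw from the bimodal distribution of Eq. (4) can be sketched as follows, with $p_0$ fixed by
the requirement $p_0 A_0 = 1/(N - 1)$ stated above. The function names and the example values
($A_0 = 0.5$, $N = 11$) are assumptions made for illustration.

```python
import random


def bimodal_probability(a0, n_elements):
    """Return p_0 such that the mean factor p_0 * A_0 equals 1 / (N - 1)."""
    if not 1.0 / (n_elements - 1) <= a0 <= 1.0:
        raise ValueError("A_0 must lie between 1/(N - 1) and 1")
    return 1.0 / ((n_elements - 1) * a0)


def draw_bimodal_factor(a0, n_elements, rng):
    """Draw A: A_0 with probability p_0, and 0 otherwise."""
    return a0 if rng.random() < bimodal_probability(a0, n_elements) else 0.0


rng = random.Random(42)
samples = [draw_bimodal_factor(0.5, 11, rng) for _ in range(200_000)]
sample_mean = sum(samples) / len(samples)   # should be close to 1/(N - 1) = 0.1
```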

For the initial loads $L$, we consider a uniform distribution, which can be scaled without loss
of generality to the interval $[0, 1]$. Motivated by applications to infrastructure networks
with cost-limited capacity, e.g., power transmission networks, we assume that the maximal load
of each element is higher than its initial load by a constant tolerance $\alpha > 0$ [?]:

$$L^{\max} = (1 + \alpha) L. \tag{5}$$

In the limit $N \to \infty$, the model is thus fully characterized by the two parameters $A_0$
and $\alpha$, and in the following, we shall study the stability of the system as a function of
these parameters.



As shown above, the no-cascade probability $P_{nc}$ follows directly from the mean number of
failures induced by a failure with load $L_f$. From
$P(L_f' > L^{\max} \mid L_f) = p_0\, P(L < L_f A_0/\alpha)$, where $L$ is the initial load of
an arbitrary element, we obtain, using $(N - 1) p_0 = 1/A_0$ and the uniform distribution of
$L$ on $[0, 1]$, $\mu_f(L_f) = \min(1/A_0, L_f/\alpha)$. The integral equation (3) for the
conditional breakdown probability $P_b(L_f)$ assumes the form

$$1 - P_b(L_f) = \exp\left[ -\frac{1}{A_0}
\int_{L_f A_0}^{L_f A_0 + \min(1,\, L_f A_0/\alpha)} dL_f'\, P_b(L_f') \right]. \tag{6}$$

It has to be solved on the interval $[0, L_{f,\max}]$ with $L_{f,\max} = 1/(1 - A_0)$ for
$\alpha < A_0/(1 - A_0)$ and $L_{f,\max} = 1$ otherwise.
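
One possible way to carry out the iterative solution of Eq. (6) is a fixed-point iteration on
a uniform grid. The sketch below is not the authors' procedure: the grid size, the trapezoidal
quadrature, the linear interpolation, and the initial guess $P_b \equiv 1$ are all choices
made here, and the code assumes $0 < A_0 < 1$ so that $L_{f,\max}$ is finite. The variables
`a0` and `alpha` stand for $A_0$ and $\alpha$.

```python
import math


def solve_breakdown(a0, alpha, n_grid=401, max_iter=500, tol=1e-10):
    """Iterate Eq. (6) on a uniform grid over [0, L_f,max].

    Returns (grid, values), where values[k] approximates P_b(grid[k]).
    """
    if not 0.0 < a0 < 1.0 or alpha <= 0.0:
        raise ValueError("this sketch assumes 0 < a0 < 1 and alpha > 0")
    l_max = 1.0 / (1.0 - a0) if alpha < a0 / (1.0 - a0) else 1.0
    h = l_max / (n_grid - 1)
    grid = [k * h for k in range(n_grid)]
    p_b = [1.0] * n_grid                 # initial guess: certain breakdown

    def running_integral(values):
        # Trapezoidal integral of `values` from 0 up to each grid point.
        out = [0.0]
        for k in range(1, n_grid):
            out.append(out[-1] + 0.5 * h * (values[k - 1] + values[k]))
        return out

    def interpolate(cum, x):
        # Linear interpolation of the running integral at position x.
        x = min(max(x, 0.0), l_max)
        k = min(int(x / h), n_grid - 2)
        return cum[k] + (cum[k + 1] - cum[k]) * (x / h - k)

    for _ in range(max_iter):
        cum = running_integral(p_b)
        new = []
        for lf in grid:
            lo = lf * a0
            hi = lo + min(1.0, lf * a0 / alpha)
            integral = interpolate(cum, hi) - interpolate(cum, lo)
            new.append(1.0 - math.exp(-integral / a0))
        change = max(abs(x - y) for x, y in zip(new, p_b))
        p_b = new
        if change < tol:
            break
    return grid, p_b


def total_breakdown(a0, alpha):
    """Average P_b(L_f) over initially failing loads uniform on [0, 1].

    Integrates up to the last grid point not exceeding 1, so a small
    truncation error of the order of the grid spacing is possible.
    """
    grid, p_b = solve_breakdown(a0, alpha)
    h = grid[1] - grid[0]
    total = 0.0
    for k in range(1, len(grid)):
        if grid[k] <= 1.0 + 1e-12:
            total += 0.5 * h * (p_b[k - 1] + p_b[k])
    return total
```

Starting from $P_b \equiv 1$ makes the iteration decrease monotonically, so it settles on the
nontrivial solution when one exists and on zero otherwise. Convergence can become slow close to
the transition between the two regimes.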

If we assume that the initially failing element is chosen at random with equal probability, we
obtain the total no-cascade and breakdown probabilities, $P_{nc}$ and $P_b$, respectively, by
integrating the corresponding conditioned probabilities over the range $[0, 1]$ of possible
initially failing loads. Whereas for the breakdown probability, the integral has to be
performed numerically from the iterative solution of Eq. (6), the no-cascade probability can be
obtained explicitly:

$$P_{nc} = \begin{cases}
\alpha + (1 - \alpha - \alpha/A_0)\, e^{-1/A_0} & \text{for } \alpha < A_0, \\
\alpha - \alpha\, e^{-1/\alpha} & \text{for } \alpha \ge A_0.
\end{cases} \tag{7}$$

Figure 1 shows the probabilities $P_{nc}$ and $P_b$ as a function of the tolerance $\alpha$ for
various values of the redistribution factor $A_0$. We observe (see upper panel) that the
no-cascade probability $P_{nc}$ gradually increases from its minimal value $\exp(-1/A_0)$ for
$\alpha = 0$ but remains considerably below one over the considered $\alpha$-range. This is in
stark contrast to the behavior of the breakdown probability (see lower panel), which decreases
with increasing tolerance $\alpha$ and vanishes completely above a certain critical
$\alpha$-value. In this latter regime, the system becomes stable in the sense that cascading
failures affecting it as a whole do not occur with finite probability. With increasing
load-redistribution factor $A_0$, the transition to this regime happens at higher
$\alpha$-values and also becomes less sharp. Comparing with the no-cascade probabilities
$P_{nc}$, we find that those cannot serve as a reliable indicator for the system stability:
Consider, for example, the case $\alpha = 0.5$, where the breakdown probability varies strongly
with $A_0$, as opposed to the no-cascade probability, which is even independent of $A_0$ for
$A_0 < \alpha$.
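
The explicit no-cascade probability can be cross-checked against a direct numerical integration
of $\exp[-\mu_f(L_f)]$ over $L_f \in [0, 1]$ with $\mu_f(L_f) = \min(1/A_0, L_f/\alpha)$. The
following sketch does this with a midpoint rule; the function names and the number of
integration points are choices made for this example.

```python
import math


def p_nc_closed(a0, alpha):
    """Closed-form no-cascade probability for the bimodal example."""
    if alpha < a0:
        return alpha + (1.0 - alpha - alpha / a0) * math.exp(-1.0 / a0)
    return alpha - alpha * math.exp(-1.0 / alpha)


def p_nc_numeric(a0, alpha, n=20_000):
    """Midpoint-rule integral of exp(-mu_f(L_f)) over L_f in [0, 1]."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        lf = (k + 0.5) * h
        total += math.exp(-min(1.0 / a0, lf / alpha))
    return h * total


for a0, alpha in [(0.5, 0.2), (0.1, 0.5), (0.3, 0.3)]:
    print(a0, alpha, p_nc_closed(a0, alpha), p_nc_numeric(a0, alpha))
```

For $\alpha = 0.5$ the closed form returns the same value for every $A_0 < \alpha$, which is
the independence of $A_0$ noted in the text.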

In Fig. 1, we also compare the results from the generalized-branching-process approximation
with those obtained from a Monte-Carlo simulation of the full stochastic dynamics (1) for a
system consisting of $N = 2000$ elements (see symbols in Fig. 1). Within the statistical error,
we find a very good agreement, except near the transition to a stable system in the case of
small load-redistribution factors $A_0$. In this regime, the failing load is distributed to a
large number of elements, but not all of them fail immediately. Their increased load, however,
will eventually lead to a higher breakdown probability than predicted by the branching-process
approximation, where this effect is neglected. As the number of such elements is independent of
the system size, this finite-size effect will vanish in the limit of very large systems. As
shown exemplarily for the case $A_0 = 0.1$ and $\alpha = 0.15$ in the inset of Fig. 1, the
breakdown probability obtained from Monte-Carlo simulations indeed approaches zero with
increasing system size $N$, in agreement with the solution obtained from Eq. (6).

FIG. 1: No-cascade (upper panel) and breakdown (lower panel) probabilities, $P_{nc}$ and $P_b$,
respectively, as a function of the tolerance $\alpha$ for different values of the
load-redistribution parameter $A_0$. Lines: Eq. (7) (upper panel) and results from an iterative
solution of Eq. (6) (lower panel). Symbols: Monte-Carlo results from a simulation of the
stochastic dynamics (1) for $N = 2000$ elements and 10^? realizations. The statistical error is
below the size of the symbols. Inset of lower panel: Monte-Carlo results (from 10^?
realizations) as a function of the system size for $A_0 = 0.1$ and $\alpha = 0.15$, as
indicated by the arrow in the lower panel. The line serves as a guide to the eye.

It is also interesting to look at the behavior of the breakdown probability as a function of
the load-redistribution factor $A_0$ (see Fig. 2). For a fixed tolerance $\alpha$, we find a
vanishing breakdown probability $P_b$ for small $A_0$, which corresponds to "well-connected"
systems where the failing load is redistributed to a large number of other elements. Above a
critical $A_0$-value, the breakdown probability increases abruptly, in particular for small
tolerances $\alpha$. It reaches a maximum and then gradually decreases again towards zero in
the limit of $A_0$ going to unity, where the failing load is transferred to a single other
element. It follows that the network is robust against cascading breakdown if $A_0$ is smaller
than its $\alpha$-dependent critical value.

FIG. 2: Breakdown probability $P_b$ obtained from an iterative solution of Eq. (6) as a
function of the load-redistribution parameter $A_0$ for different values of the element
tolerance $\alpha$.

Finally, we compare our results with those of a simple branching process model, e.g., Refs.
[1, ?], where the induced load increments are independent of the load of the failing element.
In these models, the no-cascade probability $P_{nc}$ as well as the breakdown probability $P_b$
are completely determined by a single quantity, the mean number $\mu_f$ of failures that are
induced by a failing element. In particular, $P_b$ is zero if $\mu_f < 1$ and finite if
$\mu_f > 1$. When such a model is applied to our prototype example, the load increments after a
failure are equal to a constant $Q_0$ with probability $p_0$ and zero otherwise. It follows
that $\mu_f = \min(1/(2\alpha), 1/(2 Q_0))$, and we note that for a consistent comparison with
our model, $Q_0$ has to be identified with $A_0/2$. As a function of $\alpha$, the breakdown
probability $P_b$ then becomes zero at the critical value $\alpha_c = 1/2$, which is
independent of the value of $A_0$. In contrast to our results of Fig. 2, we thus find that such
a model does not exhibit a critical behavior with respect to the parameter $A_0$, i.e., the
breakdown probability stays finite for arbitrarily small values of $A_0$ if $\alpha < 1/2$. For
$\alpha > 1/2$, $P_b$ is zero for all values of $A_0$ ($0 < A_0 < 1$).



V. CONCLUSIONS 

We have introduced and analyzed a class of stochastic failure-propagation models which,
compared to previous approaches, enable a more realistic description of real systems, while
still being amenable to an analytical treatment. The approach is applied to a prototype example
that is motivated by the propagation of line failures in power transmission networks. The
initial loads (power flows) have random values, and the maximum load an element can carry is
assumed to be equal to $(1 + \alpha)$ times its initial load, Eq. (5). With this example, we
have demonstrated that our model exhibits a critical behavior not only as a function of the
failure tolerance $\alpha$, but also with respect to a parameter $A_0$ that characterizes the
variance of the load-redistribution factors and thus depends on physical as well as on
topological properties of the load or flow dynamics.

While our assumption of stochastic load redistribution neglects any spatial correlations, we
are still able to gain new insights into the vulnerability of complex networks. If a more
realistic distribution of the redistribution factors $A$ is used, our results on the critical
behavior of the breakdown probability with respect to, e.g., failure tolerance and connectivity
may give valuable information for the design of more robust infrastructure systems.



Finally, we note that our models can be applied not only to critical infrastructure networks,
but also to other breakdown phenomena, e.g., to failure propagation in elastic fiber bundles
[1, ?]. Within our approach, a corresponding model (with stochastic load redistribution) is
obtained if we assume that the initial loads are all identical and that the critical loads of
the individual elements are randomly distributed. A detailed study of such models will be
presented in a separate publication [17].